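Hearing loss is a major health problem and psychological burden in humans. Mouse models offer a way to elucidate the genes involved in the underlying developmental and pathophysiological mechanisms. To this end, large-scale mouse phenotyping programs include auditory phenotyping of single-gene knockout mouse lines. Using the auditory brainstem response (ABR) procedure, the German Mouse Clinic and similar facilities worldwide have produced large, uniform sets of ABR raw data from mutant and wild-type mice. In standard ABR analysis, hearing thresholds are assessed visually by trained staff from signal curves recorded at increasing sound pressure levels. This is time-consuming and prone to bias from the reader as well as from the quality and scale of the graphical display. To reduce workload and improve quality and reproducibility, we developed and compared two methods for automated hearing threshold identification from averaged ABR raw data: a supervised approach involving two combined neural networks trained on human-generated labels, and a self-supervised approach that exploits the signal power spectrum and combines random forest sound level estimation with a curve-fitting algorithm for threshold finding. We show that both models work well, outperform human threshold detection, and are suitable for fast, reliable, and unbiased hearing threshold detection and quality control. In a high-throughput mouse phenotyping environment, both methods perform well as part of an automated end-to-end screening pipeline for detecting candidate genes involved in hearing. The code for both models and the data used for this work are freely available.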
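Fiducial markers are commonly used in navigation-assisted minimally invasive spine surgery (MISS), where they help transfer image coordinates into real-world coordinates. In practice, these markers may lie outside the field of view (FOV) because of the limited detector size of the C-arm cone-beam computed tomography (CBCT) systems used intraoperatively. As a result, the reconstructed markers in CBCT volumes suffer from artifacts and have distorted shapes, which poses an obstacle to navigation. In this work, we propose two fiducial marker detection methods: direct detection from distorted markers (direct method) and detection after marker recovery (recovery method). For direct detection of distorted markers in reconstructed volumes, we propose an efficient automatic marker detection method that uses two neural networks and a conventional circle detection algorithm. For marker recovery, we propose a task-specific learning strategy to recover markers from severely truncated data. A conventional marker detection algorithm is then applied for position detection. Both methods are evaluated on simulated and real data, and both achieve a marker registration error below 0.2 mm. Our experiments demonstrate that the direct method detects distorted markers accurately and that the recovery method with task-specific learning is highly robust and generalizes well across various data sets. In addition, task-specific learning can accurately reconstruct other structures of interest, e.g., ribs for image-guided needle biopsies, from severely truncated data, which opens up new potential applications for CBCT systems.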
In this paper, we present an evolved version of Situational Graphs, which jointly models, in a single optimizable factor graph, a SLAM graph (a set of robot keyframes containing their associated measurements and robot poses) and a 3D scene graph (a high-level representation of the environment that encodes its geometric elements with semantic attributes, together with the relational information between those elements). Our proposed S-Graphs+ is a novel four-layered factor graph that includes: (1) a keyframes layer with robot pose estimates, (2) a walls layer representing wall surfaces, (3) a rooms layer encompassing sets of wall planes, and (4) a floors layer gathering the rooms within a given floor level. This graph is optimized in real time to obtain a robust and accurate estimate of the robot's pose and its map, simultaneously constructing and leveraging the high-level information of the environment. To extract such high-level information, we present novel room and floor segmentation algorithms that use the mapped wall planes and free-space clusters. We tested S-Graphs+ on multiple datasets, including simulations of distinct indoor environments, real datasets captured over several construction sites and office environments, and a real public dataset of indoor office environments. S-Graphs+ outperforms relevant baselines on the majority of the datasets while extending the robot's situational awareness with a four-layered scene model. Moreover, we make the algorithm available as a docker file.
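The four-layer hierarchy described above can be pictured with a small data-structure sketch. This is only an illustration of how the layers reference each other; the class names, fields, and the `keyframes_in_floor` helper are hypothetical and are not taken from the S-Graphs+ implementation, and no factor-graph optimization is performed here.

```python
from dataclasses import dataclass, field


@dataclass
class Keyframe:
    id: int
    pose: tuple  # hypothetical pose parameterization, e.g. (x, y, z, roll, pitch, yaw)


@dataclass
class Wall:
    id: int
    plane: tuple  # plane coefficients (nx, ny, nz, d); an assumption for this sketch
    observed_by: list = field(default_factory=list)  # ids of keyframes that saw this wall


@dataclass
class Room:
    id: int
    wall_ids: list  # a room groups a set of wall planes


@dataclass
class Floor:
    id: int
    room_ids: list  # a floor gathers the rooms on one level


@dataclass
class LayeredGraph:
    keyframes: dict = field(default_factory=dict)
    walls: dict = field(default_factory=dict)
    rooms: dict = field(default_factory=dict)
    floors: dict = field(default_factory=dict)

    def keyframes_in_floor(self, floor_id):
        """Walk floor -> rooms -> walls -> keyframes and return sorted keyframe ids."""
        seen = set()
        for room_id in self.floors[floor_id].room_ids:
            for wall_id in self.rooms[room_id].wall_ids:
                seen.update(self.walls[wall_id].observed_by)
        return sorted(seen)


# Tiny example: one floor, one room, two walls, two keyframes.
graph = LayeredGraph()
graph.keyframes[0] = Keyframe(0, (0.0, 0.0, 0.0, 0.0, 0.0, 0.0))
graph.keyframes[1] = Keyframe(1, (1.0, 0.0, 0.0, 0.0, 0.0, 0.0))
graph.walls[0] = Wall(0, (1.0, 0.0, 0.0, 2.0), observed_by=[0, 1])
graph.walls[1] = Wall(1, (0.0, 1.0, 0.0, 3.0), observed_by=[1])
graph.rooms[0] = Room(0, wall_ids=[0, 1])
graph.floors[0] = Floor(0, room_ids=[0])
```

In an actual factor graph, each of these cross-layer references would correspond to a factor constraining the connected variables; the sketch only records the topology.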
Machine learning methods like neural networks are extremely successful and popular in a variety of applications; however, they come at substantial computational costs, accompanied by high energy demands. At the same time, hardware capabilities are limited and there is evidence that technology scaling is stuttering, so new approaches are required to meet the performance demands of increasingly complex model architectures. As an unsafe optimization, noisy computations are more energy efficient and, given a fixed power budget, also more time efficient. However, any kind of unsafe optimization requires countermeasures to ensure functionally correct results. This work considers noisy computations in an abstract form and aims to understand the implications of such noise for the accuracy of neural-network-based classifiers as an exemplary workload. We propose a methodology called "Walking Noise" that makes it possible to assess the robustness of different layers of deep architectures by means of a so-called "midpoint noise level" metric. We then investigate the implications of additive and multiplicative noise for different classification tasks and model architectures, with and without batch normalization. While noisy training significantly increases robustness for both noise types, we observe a clear trend toward larger weights, and thus a higher signal-to-noise ratio, under additive noise injection. For the multiplicative case, we find that some networks, with suitably simple tasks, automatically learn an internal binary representation and hence become extremely robust. Overall, this work proposes a method to measure layer-specific robustness and shares first insights into how networks learn to compensate for injected noise, thereby contributing to the understanding of robustness against noisy computations.
End-to-End speech-to-speech translation (S2ST) is generally evaluated with text-based metrics. This means that generated speech has to be automatically transcribed, making the evaluation dependent on the availability and quality of automatic speech recognition (ASR) systems. In this paper, we propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR systems. BLASER leverages a multilingual multimodal encoder to directly encode the speech segments for source input, translation output and reference into a shared embedding space and computes a score of the translation quality that can be used as a proxy to human evaluation. To evaluate our approach, we construct training and evaluation sets from more than 40k human annotations covering seven language directions. The best results of BLASER are achieved by training with supervision from human rating scores. We show that when evaluated at the sentence level, BLASER correlates significantly better with human judgment compared to ASR-dependent metrics including ASR-SENTBLEU in all translation directions and ASR-COMET in five of them. Our analysis shows combining speech and text as inputs to BLASER does not increase the correlation with human scores, but best correlations are achieved when using speech, which motivates the goal of our research. Moreover, we show that using ASR for references is detrimental for text-based metrics.
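To make the idea of scoring in a shared embedding space concrete, here is a minimal sketch that compares three fixed-length embedding vectors (source, translation output, reference) with cosine similarity. The averaging rule and the function names `cosine` and `text_free_score` are assumptions chosen for illustration; the abstract does not specify BLASER's scoring function, and the supervised variant it mentions would learn the mapping from human ratings instead.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def text_free_score(src_emb, mt_emb, ref_emb):
    """Hypothetical score: mean similarity of the translation output
    to the source embedding and to the reference embedding."""
    return 0.5 * (cosine(src_emb, mt_emb) + cosine(ref_emb, mt_emb))
```

Because all three inputs are embeddings of speech segments, nothing in this computation requires a transcript, which is the property the abstract emphasizes.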
Compressing neural network architectures is important to allow the deployment of models to embedded or mobile devices, and pruning and quantization are the major approaches to compress neural networks nowadays. Both methods benefit when compression parameters are selected specifically for each layer. Finding good combinations of compression parameters, so-called compression policies, is hard as the problem spans an exponentially large search space. Effective compression policies consider the influence of the specific hardware architecture on the used compression methods. We propose an algorithmic framework called Galen to search such policies using reinforcement learning utilizing pruning and quantization, thus providing automatic compression for neural networks. Contrary to other approaches we use inference latency measured on the target hardware device as an optimization goal. With that, the framework supports the compression of models specific to a given hardware target. We validate our approach using three different reinforcement learning agents for pruning, quantization and joint pruning and quantization. Besides proving the functionality of our approach we were able to compress a ResNet18 for CIFAR-10, on an embedded ARM processor, to 20% of the original inference latency without significant loss of accuracy. Moreover, we can demonstrate that a joint search and compression using pruning and quantization is superior to an individual search for policies using a single compression method.
Efficient localization plays a vital role in many modern applications of unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs), contributing to improved control, safety, power economy, etc. The ubiquitous 5G NR (New Radio) cellular network will provide new opportunities for enhancing the localization of UAVs and UGVs. In this paper, we review radio frequency (RF) based approaches to localization. We review the RF features that can be used for localization and investigate current methods suitable for unmanned vehicles under two general categories: range-based and fingerprinting. The existing state-of-the-art literature on RF-based localization for both UAVs and UGVs is examined, and the envisioned use of 5G NR for localization enhancement and future research directions are explored.
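As an illustration of the range-based category, the sketch below performs textbook 2D trilateration from three anchors with known positions and measured distances. It is a generic example, not a method taken from the review; the function name `trilaterate_2d` and the noise-free ranges are assumptions, and a practical system would use more anchors and a least-squares or filtering step.

```python
import math


def trilaterate_2d(anchors, ranges):
    """Estimate (x, y) from three anchor positions and measured ranges.

    Subtracting the first range equation from the other two cancels the
    quadratic terms, leaving a 2x2 linear system solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; the position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Example: a vehicle at (3, 4) ranged from three hypothetical base stations.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_pos = (3.0, 4.0)
ranges = [math.dist(true_pos, a) for a in anchors]
estimate = trilaterate_2d(anchors, ranges)
```

Fingerprinting, the other category, instead matches observed RF features against a pre-recorded map and does not rely on explicit distance estimates.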
Self-supervised image denoising techniques emerged as convenient methods that allow training denoising models without requiring ground-truth noise-free data. Existing methods usually optimize loss metrics that are calculated from multiple noisy realizations of similar images, e.g., from neighboring tomographic slices. However, those approaches fail to utilize the multiple contrasts that are routinely acquired in medical imaging modalities like MRI or dual-energy CT. In this work, we propose the new self-supervised training scheme Noise2Contrast that combines information from multiple measured image contrasts to train a denoising model. We stack denoising with domain-transfer operators to utilize the independent noise realizations of different image contrasts to derive a self-supervised loss. The trained denoising operator achieves convincing quantitative and qualitative results, outperforming state-of-the-art self-supervised methods by 4.7-11.0%/4.8-7.3% (PSNR/SSIM) on brain MRI data and by 43.6-50.5%/57.1-77.1% (PSNR/SSIM) on dual-energy CT X-ray microscopy data with respect to the noisy baseline. Our experiments on different real measured data sets indicate that Noise2Contrast training generalizes to other multi-contrast imaging modalities.
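For readers unfamiliar with the PSNR figure quoted above, the following sketch computes it from its standard definition, 10·log10(MAX² / MSE), on flat lists of pixel values. The function name `psnr` and the `data_range` parameter are illustrative choices; the abstract does not state how its relative improvements over the noisy baseline were derived from the raw metric values, so that step is not reproduced here.

```python
import math


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range**2 / mse)
```

A higher PSNR means the estimate is closer to the reference, so a denoiser is typically judged by how much it raises PSNR relative to the noisy input.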
Incorporating computed tomography (CT) reconstruction operators into differentiable pipelines has proven beneficial in many applications. Such approaches usually focus on the projection data and keep the acquisition geometry fixed. However, precise knowledge of the acquisition geometry is essential for high-quality reconstruction results. In this paper, the differentiable formulation of fan-beam CT reconstruction is extended to the acquisition geometry. This allows gradient information to be propagated from a loss function on the reconstructed image into the geometry parameters. As a proof-of-concept experiment, this idea is applied to rigid motion compensation. The cost function is parameterized by a trained neural network that regresses an image quality metric from the motion-affected reconstruction alone. Using the proposed method, we are the first to optimize such an autofocus-inspired algorithm based on analytical gradients. The algorithm achieves a reduction in MSE of 35.5% and an improvement in SSIM of 12.6% over the motion-affected reconstruction. Beyond motion compensation, we see further use cases of our differentiable method in scanner calibration and in hybrid techniques employing deep models.
With the rise of AI in recent years and the increase in complexity of the models, the growing demand for computational resources is starting to pose a significant challenge. The need for higher compute power is being met with increasingly potent accelerators and the use of large compute clusters. However, the gain in prediction accuracy from large models trained on distributed and accelerated systems comes at the price of a substantial increase in energy demand, and researchers have started questioning the environmental friendliness of such AI methods at scale. Consequently, energy efficiency plays an important role for AI model developers and infrastructure operators alike. The energy consumption of AI workloads depends on the model implementation and the hardware used. Therefore, accurate measurements of the power draw of AI workflows on different types of compute nodes are key to algorithmic improvements and to the design of future compute clusters and hardware. To this end, we present measurements of the energy consumption of two typical applications of deep learning models on different types of compute nodes. Our results indicate that (1) deriving energy consumption directly from runtime is not accurate; instead, the consumption of the compute node needs to be considered with regard to its composition; (2) neglecting accelerator hardware on mixed nodes results in disproportionate inefficiency in energy consumption; and (3) the energy consumption of model training and inference should be considered separately: while training on GPUs outperforms all other node types in both runtime and energy consumption, inference on CPU nodes can be comparably efficient. One advantage of our approach is that the information on energy consumption is available to all users of the supercomputer, enabling an easy transfer to other workloads alongside a rise in user awareness of energy consumption.
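The first finding, that runtime alone is a poor proxy for energy, follows from energy being the time integral of power draw. The sketch below integrates sampled power readings with the trapezoidal rule; the function name `energy_joules` and the sample values are made up for illustration and do not come from the paper's measurements.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate power samples (watts) over time (seconds) with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Two hypothetical jobs with identical runtime (2 s) but different power profiles.
steady_job = energy_joules([0.0, 1.0, 2.0], [100.0, 100.0, 100.0])
bursty_job = energy_joules([0.0, 1.0, 2.0], [100.0, 300.0, 100.0])
```

Both jobs run for the same two seconds, yet the second consumes twice the energy, which is why per-node power measurements are needed rather than a runtime-based estimate.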